Targeted Gene Metagenomic Data Analysis ◾ 267
and barcode FASTQ files). We can import such multiplexed raw data into QIIME2 and later
demultiplex them using a QIIME2 command. To import single-end EMP-multiplexed
reads, the forward FASTQ file must be named “sequences.fastq.gz” and the barcode file
must be named “barcodes.fastq.gz”. Both files must be in the same directory, for example “data”. Then,
you can use “qiime tools import” to import the raw data as follows:
qiime tools import \
--type EMPSingleEndSequences \
--input-path data \
--output-path artifacts/multiplexed-emp-single-end.qza
Notice that we use “input-path” to specify the directory where the two files are found.
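Because the import relies on these exact file names, a small staging step before the import can help. The following is a minimal sketch, assuming the raw files arrived under other names. The function name stage_emp_single_end and the file names in the usage note are hypothetical and are not part of QIIME2.

```shell
# Hypothetical helper (not a QIIME2 command): copy raw files into an import
# directory under the names that the EMPSingleEndSequences import expects.
stage_emp_single_end() {
    # $1 = reads FASTQ, $2 = barcode FASTQ, $3 = import directory
    [ -f "$1" ] || { echo "missing reads file: $1" >&2; return 1; }
    [ -f "$2" ] || { echo "missing barcode file: $2" >&2; return 1; }
    mkdir -p "$3" || return 1
    # Copy rather than move, so the original files stay untouched.
    cp "$1" "$3/sequences.fastq.gz" && cp "$2" "$3/barcodes.fastq.gz"
}
```

For example, stage_emp_single_end run1_R1.fastq.gz run1_I1.fastq.gz data would populate the “data” directory used by the import command above.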
For paired-end EMP-multiplexed raw data, the directory must instead contain “forward.fastq.gz”, “reverse.fastq.gz”, and “barcodes.fastq.gz”, and we can use the following:
qiime tools import \
--type EMPPairedEndSequences \
--input-path data \
--output-path artifacts/multiplexed-emp-paired-end.qza
Notice that both artifacts created by the above commands contain multiplexed reads. These
artifacts must be demultiplexed, as we will discuss later.
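Before running the paired-end import, it may be worth confirming that the input directory is complete. The following is a minimal sketch, assuming the paired-end EMP layout consists of “forward.fastq.gz”, “reverse.fastq.gz”, and “barcodes.fastq.gz”. The function name check_emp_paired_dir is hypothetical and is not part of QIIME2.

```shell
# Hypothetical helper (not a QIIME2 command): report any file that the
# EMPPairedEndSequences import would expect but that is absent from a directory.
check_emp_paired_dir() {
    dir="$1"
    missing=0
    for f in forward.fastq.gz reverse.fastq.gz barcodes.fastq.gz; do
        if [ ! -f "$dir/$f" ]; then
            echo "missing: $dir/$f" >&2
            missing=1
        fi
    done
    # Status 0 when all three files are present, 1 otherwise.
    return "$missing"
}
```

Running check_emp_paired_dir data prints each missing file to standard error and returns a non-zero status if the directory is incomplete.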
7.3.1.1.3 Importing non-EMP-Multiplexed FASTQ Files
In non-EMP-multiplexed data, the barcode sequences are part of the reads themselves. This
format has a single FASTQ file (“sequences.fastq.gz”) for single-end reads
and two files (“forward.fastq.gz” and “reverse.fastq.gz”) for paired-end reads. The
following are the commands to import single-end and paired-end raw data, respectively:
qiime tools import \
--type MultiplexedSingleEndBarcodeInSequence \
--input-path data/sequences.fastq.gz \
--output-path artifacts/multiplexed-single-end.qza
qiime tools import \
--type MultiplexedPairedEndBarcodeInSequence \
--input-path data \
--output-path artifacts/multiplexed-paired-end.qza
Again, the artifacts created by the above commands contain multiplexed reads. So, they
must be demultiplexed before proceeding. Demultiplexing will be discussed later.
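The two commands above differ in two ways: the semantic type, and whether “--input-path” points to a file (single-end) or a directory (paired-end). The following is a minimal sketch of how that choice could be made from the directory contents. The function name barcode_in_sequence_args is hypothetical, and the sketch assumes paths without spaces.

```shell
# Hypothetical helper (not a QIIME2 command): print the semantic type and the
# input path to pass to "qiime tools import" for barcode-in-sequence data.
barcode_in_sequence_args() {
    dir="$1"
    if [ -f "$dir/forward.fastq.gz" ] && [ -f "$dir/reverse.fastq.gz" ]; then
        # Paired-end: the import takes the directory itself.
        echo "MultiplexedPairedEndBarcodeInSequence $dir"
    elif [ -f "$dir/sequences.fastq.gz" ]; then
        # Single-end: the import takes the path of the FASTQ file.
        echo "MultiplexedSingleEndBarcodeInSequence $dir/sequences.fastq.gz"
    else
        echo "no recognized FASTQ files in $dir" >&2
        return 1
    fi
}
```

The two printed fields can then be supplied as the values of “--type” and “--input-path” in the import command.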
7.3.1.1.4 Importing Casava Demultiplexed FASTQ Files
The Casava 1.8 demultiplexed FASTQ format identifies each sample in the FASTQ file
name, which is made of five fields separated by underscores: the
sample identifier, barcode identifier, lane number, direction of the read, and the set